Learning Mid-level Words on Riemannian Manifold for Action Recognition
Abstract
Human action recognition remains a challenging task because video data come from varied sources and exhibit large intra-class variation. Finding effective and robust representations that handle these challenges has therefore become a key issue in recent research. In this paper, we propose a novel representation approach that constructs mid-level words in videos and encodes them on Riemannian manifolds. Specifically, we first perform a global alignment of the densely extracted low-level features to build a bank of corresponding feature groups, each of which can be statistically modeled as a mid-level word lying on a specific Riemannian manifold. Based on these mid-level words, we construct intrinsic Riemannian codebooks using K-Karcher-means clustering and a Riemannian Gaussian Mixture Model, and consequently extend three well-studied Euclidean encoding methods, namely Bag of Visual Words (BoVW), Vector of Locally Aggregated Descriptors (VLAD), and Fisher Vector (FV), to Riemannian manifolds to obtain the final action video representations. Our method is evaluated on two tasks across four popular realistic datasets: action recognition on the YouTube, UCF50, and HMDB51 databases, and action similarity labeling on the ASLAN database. In all cases, the reported results are competitive with the most recent state-of-the-art work.
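The codebook-and-encoding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the mid-level words are symmetric positive definite (SPD) matrices (for example, feature covariances) compared under the affine-invariant metric, which the abstract does not specify, and every function name below is hypothetical. It covers only K-Karcher-means clustering and a BoVW-style histogram, and leaves out the Riemannian GMM, VLAD, and FV variants.

```python
import numpy as np

def _sym_fn(S, fn):
    """Apply scalar function fn to the eigenvalues of symmetric matrix S."""
    w, V = np.linalg.eigh(S)
    return (V * fn(w)) @ V.T

def _inv_sqrt(w):
    return 1.0 / np.sqrt(w)

def spd_dist(P, Q):
    """Affine-invariant geodesic distance between SPD matrices (assumed metric)."""
    P_ih = _sym_fn(P, _inv_sqrt)
    w = np.linalg.eigvalsh(P_ih @ Q @ P_ih)
    return float(np.sqrt(np.sum(np.log(w) ** 2)))

def spd_log(P, Q):
    """Log map: tangent vector at P pointing toward Q."""
    P_h, P_ih = _sym_fn(P, np.sqrt), _sym_fn(P, _inv_sqrt)
    return P_h @ _sym_fn(P_ih @ Q @ P_ih, np.log) @ P_h

def spd_exp(P, T):
    """Exp map: point reached from P along tangent vector T."""
    P_h, P_ih = _sym_fn(P, np.sqrt), _sym_fn(P, _inv_sqrt)
    return P_h @ _sym_fn(P_ih @ T @ P_ih, np.exp) @ P_h

def karcher_mean(mats, iters=50, tol=1e-10):
    """Karcher (Frechet) mean by fixed-point iteration in the tangent space."""
    mean = np.mean(mats, axis=0)  # arithmetic mean of SPD matrices is SPD
    for _ in range(iters):
        T = np.mean([spd_log(mean, X) for X in mats], axis=0)
        mean = spd_exp(mean, T)
        if np.linalg.norm(T) < tol:
            break
    return mean

def nearest(centers, X):
    """Index of the codeword closest to X in geodesic distance."""
    return int(np.argmin([spd_dist(C, X) for C in centers]))

def k_karcher_means(mats, k, iters=10, seed=0):
    """K-means on the manifold: geodesic assignment, Karcher-mean update."""
    rng = np.random.default_rng(seed)
    centers = [mats[i] for i in rng.choice(len(mats), size=k, replace=False)]
    for _ in range(iters):
        labels = [nearest(centers, X) for X in mats]
        for j in range(k):
            members = [X for X, l in zip(mats, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = karcher_mean(members)
    return centers

def bovw_encode(mats, centers):
    """BoVW-style encoding: normalized histogram of nearest codewords."""
    hist = np.zeros(len(centers))
    for X in mats:
        hist[nearest(centers, X)] += 1.0
    return hist / hist.sum()
```

In this sketch a video would be represented by calling `bovw_encode` on its own mid-level words against a codebook learned once with `k_karcher_means` on training data. The only change from Euclidean k-means is that distances and means are taken along geodesics rather than straight lines.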
Similar resources
ACTION OF SEMISIMPLE ISOMETRY GROUPS ON SOME RIEMANNIAN MANIFOLDS OF NONPOSITIVE CURVATURE
A manifold with a smooth action of a Lie group G is called a G-manifold. In this paper we consider a complete Riemannian manifold M with the action of a closed and connected Lie subgroup G of the isometries. The dimension of the orbit space is called the cohomogeneity of the action. Manifolds having actions of cohomogeneity zero are called homogeneous. A classic theorem about Riemannian manifolds...
ON THE LIFTS OF SEMI-RIEMANNIAN METRICS
In this paper, we extend the Sasaki metric for the tangent bundle of a Riemannian manifold and the Sasaki-Mok metric for the frame bundle of a Riemannian manifold [I] to the case of a semi-Riemannian vector bundle over a semi-Riemannian manifold. In fact, if E is a semi-Riemannian vector bundle over a semi-Riemannian manifold M, then by using an arbitrary (linear) connection on E, we can make E, as a...
A Geometry Preserving Kernel over Riemannian Manifolds
The kernel trick and projection to tangent spaces are two choices for linearizing data points lying on Riemannian manifolds. These approaches provide the prerequisites for applying standard machine learning methods on Riemannian manifolds. Classical kernels implicitly project data to a high-dimensional feature space without considering the intrinsic geometry of the data points. ...
Latent semantic learning with structured sparse representation for human action recognition
This paper proposes a novel latent semantic learning method for extracting high-level latent semantics from a large vocabulary of abundant mid-level features (i.e. visual keywords) with structured sparse representation, which can help to bridge the semantic gap in the challenging task of human action recognition. To discover the manifold structure of mid-level features, we develop a graph-based...
Learning semantic features for action recognition via diffusion maps
Efficient modeling of actions is critical for recognizing human actions. Recently, the bag of video words (BoVW) representation, in which features computed around spatiotemporal interest points are quantized into video words based on their appearance similarity, has been widely and successfully explored. The performance of this representation, however, is highly sensitive to two main factors: the gr...
Journal: CoRR
Volume: abs/1511.04808
Publication date: 2015